ICTDAT402
Clean and verify data


Application

This unit describes the skills and knowledge required to clean and verify data obtained from a variety of sources. It involves the use of analytics and review to ensure that an organisation's data quality accords with industry practices and organisational policies, procedures and protocols.

It applies to data analytics specialists who work within a broad range of industries and are responsible for processing data sets for a business.

No licensing, legislative or certification requirements apply to this unit at the time of publication.


Elements and Performance Criteria

ELEMENT

PERFORMANCE CRITERIA

Elements describe the essential outcomes.

Performance criteria describe the performance needed to demonstrate achievement of the element.

1. Prepare data sets

1.1 Identify data sets and establish task requirements according to business needs

1.2 Unify data from different sets according to task requirements

1.3 Review data and confirm accuracy of input and restriction to numerical values

2. Review and clean data set

2.1 Identify and remove incorrect data input and formulate data according to task requirements

2.2 Confirm required data set parameter range according to task requirements

2.3 Run analytics and confirm data set consistency according to task requirements

2.4 Remove any data values that are outside the upper and lower thresholds of the acceptable range

3. Verify data set

3.1 Confirm consistency between digitally entered data and manually entered data

3.2 Identify and review over-writes according to organisational requirements

3.3 Review data set and confirm analytical suitability according to task requirements

3.4 Store data set securely according to organisational procedures, legislative requirements and industry standard practices

3.5 Obtain final task sign off from required personnel
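As an illustration only, criteria 1.2, 1.3 and 2.4 could be sketched in Python as shown below. This is a minimal sketch, not a prescribed method: the function names (`to_number`, `unify`, `clean`), the `temp` field, the sample readings and the thresholds of -40 and 60 are all hypothetical, and the unit itself does not mandate any tool or language. The sketch sets out-of-range and non-numerical rows aside rather than discarding them, on the assumption that removed values may need review later.

```python
import math


def to_number(value):
    """Return value as a float, or None if it is not a finite numerical value."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if math.isfinite(number) else None


def unify(*sources):
    """Combine rows from several sources into one list (criterion 1.2)."""
    return [dict(row) for source in sources for row in source]


def clean(rows, field, lower, upper):
    """Split rows into kept and rejected lists.

    A row is rejected when `field` is not numerical (criterion 1.3)
    or falls outside the lower and upper thresholds (criterion 2.4).
    """
    kept, rejected = [], []
    for row in rows:
        number = to_number(row.get(field))
        if number is None or not (lower <= number <= upper):
            rejected.append(row)
        else:
            # Store the parsed number so later steps work with one type.
            kept.append({**row, field: number})
    return kept, rejected


# Hypothetical sample data from two different sources.
source_a = [{"id": 1, "temp": "21.5"}, {"id": 2, "temp": "n/a"}]
source_b = [{"id": 3, "temp": 250}, {"id": 4, "temp": 19}]

rows = unify(source_a, source_b)
kept, rejected = clean(rows, "temp", lower=-40, upper=60)
print("kept:", kept)
print("rejected:", rejected)
```

Keeping the rejected rows in a separate list leaves a record of what was removed and why, which may help when the task is presented for sign off.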

Evidence of Performance

The candidate must demonstrate the ability to complete the tasks outlined in the elements, performance criteria and foundation skills of this unit, including evidence of the ability to:

combine at least two data sets from different sources

confirm accuracy of the two combined data sets.
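One possible way to confirm the accuracy of two combined data sets, for example digitally captured records against manually entered records as in criterion 3.1, is to match them on a shared key and report any discrepancies. The following is a minimal sketch under stated assumptions: the names `index_by` and `compare_sources`, the `id` and `qty` fields, and the sample records are hypothetical, and it assumes each key appears only once per source.

```python
def index_by(rows, key):
    """Build a lookup from key value to row, rejecting duplicate keys."""
    index = {}
    for row in rows:
        if row[key] in index:
            raise ValueError(f"duplicate key: {row[key]!r}")
        index[row[key]] = row
    return index


def compare_sources(digital, manual, key, field, tolerance=0.0):
    """Report keys missing from either source and keys whose values differ.

    Values are treated as matching when they differ by no more than
    `tolerance`.
    """
    digital_index = index_by(digital, key)
    manual_index = index_by(manual, key)
    shared = digital_index.keys() & manual_index.keys()
    return {
        "missing_in_manual": sorted(digital_index.keys() - manual_index.keys()),
        "missing_in_digital": sorted(manual_index.keys() - digital_index.keys()),
        "mismatched": sorted(
            k for k in shared
            if abs(digital_index[k][field] - manual_index[k][field]) > tolerance
        ),
    }


# Hypothetical sample records from two different sources.
digital = [{"id": 1, "qty": 10}, {"id": 2, "qty": 5}, {"id": 3, "qty": 7}]
manual = [{"id": 1, "qty": 10}, {"id": 2, "qty": 6}, {"id": 4, "qty": 1}]

report = compare_sources(digital, manual, key="id", field="qty")
print(report)
```

An empty report (no missing keys and no mismatches) would suggest the two sources agree on the compared field; any non-empty entry identifies records to review before the data sets are treated as accurate.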


Evidence of Knowledge

The candidate must be able to demonstrate knowledge to complete the tasks outlined in the elements, performance criteria and foundation skills of this unit, including knowledge of:

legislative requirements relating to data capture and storage, including data protection, security and privacy laws and regulations

organisational policies, procedures and protocols relating to protecting data integrity for:

data accuracy

identification of data over-writes

verifying data security

monitoring data discrepancies between different sources

digital versus manual data entry

monitoring data integrity

identifying where data breaches have occurred

ethical management and governance of data, including determining availability of data and confidentiality of data

compliance requirements and regulations relating to data loss

key components of policies in place for protecting confidential and private business information and intellectual property in data assets, including:

privacy policies

security policies

intellectual property policies

data analytics including feature extraction procedures.


Assessment Conditions

Skills in this unit must be demonstrated in a workplace or simulated environment where the conditions are typical of those in a working environment in this industry.

This includes access to:

information and data sources to inform data analysis

information and telecommunications equipment required to analyse data

industry standards, organisational procedures, and legislative requirements.

Assessors of this unit must satisfy the requirements for assessors in applicable vocational education and training legislation, frameworks and/or standards.


Foundation Skills

This section describes those language, literacy, numeracy and employment skills that are essential to performance but not explicit in the performance criteria.

SKILL

DESCRIPTION

Learning

Modifies behaviour following exposure to new information

Numeracy

Interprets mathematical data and applies interpretation to task outcomes

Completes complex calculations and records mathematical data

Planning and organising

Sequences stages in cleaning and verifying data efficiently and logically

Prioritises tasks and own workload for required outcomes

Problem solving

Identifies and resolves barriers to successful delivery of cyber security infrastructure

Demonstrates an understanding of how to address less predictable problems and initiates standard procedures in response

Self-management

Implements standard procedures and makes decisions for routine tasks

Technology

Uses technology platforms to assist with data analysis


Sectors

Data analytics